AITopics

Country:

North America > United States (0.14)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Neural Information Processing Systems, Feb-16-2026, 11:46:04 GMT

ac01e21bb14609416760f790dd8966ae-Paper-Datasets_and_Benchmarks.pdf

artificial intelligence, data mining, machine learning, (15 more...)

Country:

North America > United States (0.14)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.70)

Neural Information Processing Systems, Oct-9-2025, 04:26:08 GMT

A benchmark of categorical encoders for binary classification

artificial intelligence, encoder, machine learning, (14 more...)

Country:

North America > United States (0.14)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Aledo, Juan A., Gámez, José A., Rosete, Alejandro

A consensus set for the aggregation of partial rankings: the case of the Optimal Set of Bucket Orders Problem

arXiv.org Artificial Intelligence, Feb-19-2025

In rank aggregation problems (RAP), the solution is usually a consensus ranking that generalizes a set of input orderings. There are different variants that differ not only in terms of the type of rankings that are used as input and output, but also in terms of the objective function employed to evaluate the quality of the desired output ranking. In contrast, in some machine learning tasks (e.g. subgroup discovery) or multimodal optimization tasks, attention is devoted to obtaining several models/results to account for the diversity in the input data or across the search landscape. Thus, in this paper we propose to provide, as the solution to an RAP, a set of rankings to better explain the preferences expressed in the input orderings. We exemplify our proposal through the Optimal Bucket Order Problem (OBOP), an RAP which consists in finding a single consensus ranking (with ties) that generalizes a set of input rankings codified as a precedence matrix. To address this, we introduce the Optimal Set of Bucket Orders Problem (OSBOP), a generalization of the OBOP that aims to produce not a single ranking as output but a set of consensus rankings. Experimental results are presented to illustrate this proposal, showing how, by providing a set of consensus rankings, the fitness of the solution significantly improves with respect to the one of the original OBOP, without losing comprehensibility.
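The precedence-matrix encoding mentioned in the abstract above can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not spell out: each entry of the matrix is taken to be the fraction of input rankings that place one item before another (a tie counting as 0.5 in both directions), and a candidate bucket order is scored by the sum of absolute entry-wise differences to that matrix. The function names are hypothetical, and the sketch covers only the single-consensus OBOP setting, not the set-valued OSBOP.

```python
from itertools import combinations


def precedence_matrix(rankings, items):
    """Fraction of rankings placing item i before item j (a tie counts 0.5).

    Each ranking is a dict item -> position; equal positions mean a tie.
    """
    index = {item: k for k, item in enumerate(items)}
    n = len(items)
    matrix = [[0.0] * n for _ in range(n)]
    for ranking in rankings:
        for u, v in combinations(items, 2):
            i, j = index[u], index[v]
            if ranking[u] < ranking[v]:
                matrix[i][j] += 1.0
            elif ranking[u] > ranking[v]:
                matrix[j][i] += 1.0
            else:
                matrix[i][j] += 0.5
                matrix[j][i] += 0.5
    count = len(rankings)
    # Diagonal fixed at 0.5 by convention (assumption); it is ignored below.
    return [[matrix[i][j] / count if i != j else 0.5 for j in range(n)]
            for i in range(n)]


def bucket_order_matrix(buckets, items):
    """Matrix of a bucket order: 1 if before, 0.5 if same bucket, 0 if after."""
    position = {item: b for b, bucket in enumerate(buckets) for item in bucket}
    return precedence_matrix([position], items)


def distance(bucket_matrix, pair_matrix):
    """Sum of absolute differences over all off-diagonal entries."""
    n = len(pair_matrix)
    return sum(abs(bucket_matrix[i][j] - pair_matrix[i][j])
               for i in range(n) for j in range(n) if i != j)


items = ["a", "b", "c"]
rankings = [
    {"a": 1, "b": 2, "c": 3},  # a before b before c
    {"a": 1, "b": 1, "c": 2},  # a tied with b, both before c
    {"b": 1, "a": 2, "c": 3},  # b before a before c
]
C = precedence_matrix(rankings, items)
tied = distance(bucket_order_matrix([["a", "b"], ["c"]], items), C)
strict = distance(bucket_order_matrix([["a"], ["b"], ["c"]], items), C)
print(tied, strict)
```

In this toy input the voters are split on a versus b, so the bucket order that ties them fits the matrix more closely than the strict order does.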

bucket order, obop, osbop 2, (15 more...)

2502.13769

Country:

North America > Cuba (0.04)
Europe > Spain > Castilla-La Mancha (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)

Matteucci, Federico, Arzamasov, Vadim, Boehm, Klemens

A benchmark of categorical encoders for binary classification

arXiv.org Artificial Intelligence, Nov-20-2023

Categorical encoders transform categorical features into numerical representations that are indispensable for a wide range of machine learning models. Existing encoder benchmark studies lack generalizability because of their limited choice of (1) encoders, (2) experimental factors, and (3) datasets. Additionally, inconsistencies arise from the adoption of varying aggregation strategies. This paper is the most comprehensive benchmark of categorical encoders to date, including an extensive evaluation of 32 configurations of encoders from diverse families, with 36 combinations of experimental factors, and on 50 datasets. The study shows the profound influence of dataset selection, experimental factors, and aggregation strategies on the benchmark's conclusions -- aspects disregarded in previous encoder benchmarks.
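To make the notion of a categorical encoder concrete, here is a minimal Python sketch of two widely used families, one-hot encoding and smoothed target (mean) encoding. It is an illustration only and is not claimed to match any of the configurations evaluated in the benchmark; the function names and the `smoothing` parameter are assumptions introduced for this example.

```python
from collections import defaultdict


def one_hot_encode(values):
    """Map each category to an indicator vector over the sorted categories."""
    categories = sorted(set(values))
    rows = [[1 if value == c else 0 for c in categories] for value in values]
    return rows, categories


def target_encode(values, targets, smoothing=1.0):
    """Replace each category by a smoothed mean of the binary target.

    The per-category mean is shrunk toward the global mean; `smoothing`
    acts as a pseudo-count (a hypothetical parameter for this sketch).
    Fitting and applying on the same rows, as done here, leaks the target;
    in practice the statistics would come from training folds only.
    """
    prior = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, target in zip(values, targets):
        sums[value] += target
        counts[value] += 1
    return [(sums[v] + smoothing * prior) / (counts[v] + smoothing)
            for v in values]


colors = ["red", "blue", "red", "green"]
labels = [1, 0, 0, 1]
one_hot_rows, one_hot_columns = one_hot_encode(colors)
encoded = target_encode(colors, labels, smoothing=1.0)
print(one_hot_columns, one_hot_rows)
print(encoded)
```

One-hot encoding grows with the number of categories, while target encoding always yields a single numeric column, which is one reason benchmark conclusions can depend on dataset cardinality and the downstream model.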

aggregation strategy, dataset, encoder, (13 more...)

2307.09191

Country:

North America > United States (0.14)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Cachel, Kathleen, Rundensteiner, Elke, Harrison, Lane

MANI-Rank: Multiple Attribute and Intersectional Group Fairness for Consensus Ranking

arXiv.org Artificial Intelligence, Jul-20-2022

Combining the preferences of many rankers into one single consensus ranking is critical for consequential applications from hiring and admissions to lending. While group fairness has been extensively studied for classification, group fairness in rankings and in particular rank aggregation remains in its infancy. Recent work introduced the concept of fair rank aggregation for combining rankings but restricted to the case when candidates have a single binary protected attribute, i.e., they fall into two groups only. Yet it remains an open problem how to create a consensus ranking that represents the preferences of all rankers while ensuring fair treatment for candidates with multiple protected attributes such as gender, race, and nationality. In this work, we are the first to define and solve this open Multi-attribute Fair Consensus Ranking (MFCR) problem. As a foundation, we design novel group fairness criteria for rankings, called MANI-RANK, ensuring fair treatment of groups defined by individual protected attributes and their intersection. Leveraging the MANI-RANK criteria, we develop a series of algorithms that for the first time tackle the MFCR problem. Our experimental study with a rich variety of consensus scenarios demonstrates our MFCR methodology is the only approach to achieve both intersectional and protected attribute fairness while also representing the preferences expressed through many base rankings. Our real-world case study on merit scholarships illustrates the effectiveness of our MFCR methods to mitigate bias across multiple protected attributes and their intersections. This is an extended version of "MANI-Rank: Multiple Attribute and Intersectional Group Fairness for Consensus Ranking", to appear in ICDE 2022.

base ranking, consensus ranking, fairness, (10 more...)

2207.1002

Country:

Asia > Middle East > Oman (0.04)
North America > United States > New York (0.04)
North America > United States > Michigan (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Education (1.00)
Government (0.93)
Law > Civil Rights & Constitutional Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.68)

Pearce, Michael, Erosheva, Elena A.

A Unified Statistical Learning Model for Rankings and Scores with Application to Grant Panel Review

arXiv.org Machine Learning, Jan-7-2022

Rankings and scores are two common data types used by judges to express preferences and/or perceptions of quality in a collection of objects. Numerous models exist to study data of each type separately, but no unified statistical model captures both data types simultaneously without first performing data conversion. We propose the Mallows-Binomial model to close this gap, which combines a Mallows' φ ranking model with Binomial score models through shared parameters that quantify object quality, a consensus ranking, and the level of consensus between judges. We propose an efficient tree-search algorithm to calculate the exact MLE of model parameters, study statistical properties of the model both analytically and through simulation, and apply our model to real data from an instance of grant panel review that collected both scores and partial rankings. Furthermore, we demonstrate how model outputs can be used to rank objects with confidence. The proposed model is shown to sensibly combine information from both scores and rankings to quantify object quality and measure consensus with appropriate levels of statistical uncertainty. Keywords: preference learning, score and ranking aggregation, Mallows' model, A* algorithm, peer review
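The idea of tying a Mallows ranking model to Binomial score models through shared parameters can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: rankings are complete rather than partial, the Mallows model uses the Kendall tau distance with a brute-force normalizing constant, lower quality values are taken to mean better objects, and the consensus ranking is obtained by sorting objects on those values. All function and parameter names are hypothetical.

```python
from itertools import combinations, permutations
from math import comb, exp, log


def kendall_distance(r1, r2):
    """Number of item pairs ordered differently by two complete rankings."""
    pos1 = {item: k for k, item in enumerate(r1)}
    pos2 = {item: k for k, item in enumerate(r2)}
    return sum(1 for a, b in combinations(r1, 2)
               if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0)


def mallows_log_prob(ranking, consensus, theta):
    """log P(ranking), with P proportional to exp(-theta * distance)."""
    # Brute-force normalization: only feasible for a handful of objects.
    log_norm = log(sum(exp(-theta * kendall_distance(p, consensus))
                       for p in permutations(consensus)))
    return -theta * kendall_distance(ranking, consensus) - log_norm


def binomial_log_prob(score, max_score, p):
    """log P(score) for score ~ Binomial(max_score, p), with 0 < p < 1."""
    return (log(comb(max_score, score)) + score * log(p)
            + (max_score - score) * log(1 - p))


def joint_log_likelihood(rankings, scores, quality, theta, max_score):
    """Joint log-likelihood of rankings and scores under shared parameters.

    `quality` maps object -> p in (0, 1); sorting by p gives the consensus
    ranking (assumption: lower p means a better object).
    """
    consensus = tuple(sorted(quality, key=quality.get))
    total = sum(mallows_log_prob(r, consensus, theta) for r in rankings)
    for judge_scores in scores:
        for obj, score in judge_scores.items():
            total += binomial_log_prob(score, max_score, quality[obj])
    return total


quality = {"A": 0.2, "B": 0.5, "C": 0.8}
rankings = [("A", "B", "C"), ("B", "A", "C")]
scores = [{"A": 1, "B": 2, "C": 4}, {"A": 0, "B": 3, "C": 3}]
print(joint_log_likelihood(rankings, scores, quality, theta=1.0, max_score=4))
```

The sketch only evaluates the likelihood at given parameter values; finding the exact MLE is the part the abstract attributes to a tree-search algorithm, which is not reproduced here.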

algorithm, mallow-binomial model, ranking, (13 more...)

2201.02539

Country:

North America > United States > Washington > King County > Seattle (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.84)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Gilbert, Hugo, Portoleau, Tom, Spanjaard, Olivier

Beyond Pairwise Comparisons in Social Choice: A Setwise Kemeny Aggregation Problem

arXiv.org Artificial Intelligence, Nov-14-2019

Rank aggregation aims at producing a single ranking from a collection of rankings of a fixed set of alternatives. In social choice theory (e.g., Moulin 1991), where the alternatives are candidates to an election and each ranking represents the preferences of a voter, aggregation rules are called Social Welfare Functions (SWFs). Apart from social choice, rank aggregation has proved useful in many applications, including preference learning (Cheng and Hüllermeier, 2009; Clémençon et al., 2018), collaborative filtering (Wang et al., 2014), genetic map creation (Jackson et al., 2008), similarity search in database systems (Fagin et al., 2003) and design of web search engines (Altman and Tennenholtz, 2008; Dwork et al., 2001). In the following, we use interchangeably the terms "input rankings" and "preferences", "output ranking" and "consensus ranking", as well as "alternatives" and "candidates". The well-known Arrow's impossibility theorem states that there exists no aggregation rule satisfying a small set of desirable properties (Arrow, 1950). In the absence of an "ideal" rule, various aggregation rules have been proposed and studied. Following Fishburn's classification (1977), we can distinguish between the SWFs for which the output ranking can be computed from the majority graph alone, those for which the output ranking can be computed from the ...

[Table 1: Results of setwise contests in Example 1.]
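As a concrete instance of the rank aggregation task described in the excerpt above, the following Python sketch computes a consensus ranking with the standard (pairwise) Kemeny rule by exhaustive search. It is not the setwise generalization the paper's title refers to, and the function names are hypothetical; the brute-force search is only practical for a handful of alternatives.

```python
from itertools import combinations, permutations


def kendall_distance(r1, r2):
    """Number of pairs of alternatives that two rankings order differently."""
    pos1 = {item: k for k, item in enumerate(r1)}
    pos2 = {item: k for k, item in enumerate(r2)}
    return sum(1 for a, b in combinations(r1, 2)
               if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0)


def kemeny_consensus(rankings):
    """Return a ranking minimizing total pairwise disagreement, and its cost.

    Every input ranking must be a complete ordering of the same
    alternatives, listed best first.
    """
    alternatives = sorted(rankings[0])

    def cost(candidate):
        return sum(kendall_distance(candidate, r) for r in rankings)

    best = min(permutations(alternatives), key=cost)
    return best, cost(best)


preferences = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c")]
consensus, disagreement = kemeny_consensus(preferences)
print(consensus, disagreement)
```

When several rankings attain the minimum, `min` returns the first one in lexicographic order of the sorted alternatives, which is an arbitrary tie-breaking choice made for this sketch.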

consensus ranking, disagreement, kemeny rule, (13 more...)

1911.06226

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
North America > United States > North Carolina (0.04)
(9 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Information Management > Search (0.88)

Achab, Mastane, Korba, Anna, Clémençon, Stephan

Dimensionality Reduction and (Bucket) Ranking: a Mass Transportation Approach

arXiv.org Machine Learning, Oct-15-2018

Whereas most dimensionality reduction techniques (e.g. PCA, ICA, NMF) for multivariate data essentially rely on linear algebra to a certain extent, summarizing ranking data, viewed as realizations of a random permutation $\Sigma$ on a set of items indexed by $i\in \{1,\ldots,\; n\}$, is a great statistical challenge, due to the absence of vector space structure for the set of permutations $\mathfrak{S}_n$. It is the goal of this article to develop an original framework for possibly reducing the number of parameters required to describe the distribution of a statistical population composed of rankings/permutations, on the premise that the collection of items under study can be partitioned into subsets/buckets, such that, with high probability, items in a certain bucket are either all ranked higher or else all ranked lower than items in another bucket. In this context, $\Sigma$'s distribution can be hopefully represented in a sparse manner by a bucket distribution, i.e. a bucket ordering plus the ranking distributions within each bucket. More precisely, we introduce a dedicated distortion measure, based on a mass transportation metric, in order to quantify the accuracy of such representations. The performance of buckets minimizing an empirical version of the distortion is investigated through a rate bound analysis. Complexity penalization techniques are also considered to select the shape of a bucket order with minimum expected distortion. Beyond theoretical concepts and results, numerical experiments on real ranking data are displayed in order to provide empirical evidence of the relevance of the approach promoted.

artificial intelligence, data mining, machine learning, (18 more...)

1810.06291

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.61)